Abstract - In this article we present a biologically inspired motor
control scheme based on sensory-motor interaction modalities
within the Central Nervous System, and its application to the
control of a single-joint limb segment actuated by two pneumatic
McKibben muscles. The architecture of the embedded Artificial
Neural Network (ANN) module, whose functioning is regulated by
reinforcement learning, is similar to the connectivity of the
cerebellar cortex. Various biologically plausible learning schemes,
which enable functional plasticity in the cerebellar cortex,
are discussed. The simulation and experimental results are then
reported.

Keywords - Motor control, brain models, artificial neural networks,
reinforcement learning, artificial muscles.

I.  Introduction

This article addresses the problem of how the cerebellum
processes premotor orders so that fast movements, whose
duration is shorter than the sum of the transmission and
processing delays in the motor and sensory pathways, can be
accurate. By definition, fast movements cannot be regulated
in closed loop using the sensory signals, but must be driven
in open loop by motor orders that are computed as the
movement proceeds and that take into account the dynamical and
geometrical characteristics of the limb to be moved. This can
be mathematically stated as the problem of inverting the
bio-mechanical function of the limb.

In previous articles [1,2,3], we presented a circuit that
allows estimation of an inverse function while avoiding an explicit
inversion operation. The model was first tested by simulations
of eye and forearm movements [4]. According to an
anatomical interpretation, the predictive elements of the
model would be embedded in the cerebellar cortex, and the
function of the whole cerebellum would be to compute
approximate inversions. Therefore, the second step consisted of
replacing the elements which are interpreted as parts of the
cerebellar cortex by an ANN architecture whose blueprint is
copied from the well-known connectivity of the cerebellar
cortex [5,6].
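
The general idea of estimating an inverse without an explicit inversion can be illustrated with a feedback loop wrapped around a forward model. The sketch below is only a generic illustration of that principle, not the circuit of [1,2,3]: the function forward_model, the gain and the number of steps are all hypothetical choices made for the example.

```python
def forward_model(u):
    """Hypothetical monotonic nonlinearity standing in for the limb."""
    return u + 0.5 * u ** 3


def approximate_inverse(target, gain=0.2, steps=200):
    """Find u such that forward_model(u) is close to target.

    The command u is corrected repeatedly by the mismatch between the
    target and the forward model's prediction, so the inverse function
    is never written down explicitly.
    """
    u = 0.0
    for _ in range(steps):
        u += gain * (target - forward_model(u))
    return u


command = approximate_inverse(1.5)
print(command, forward_model(command))
```

With these assumed settings the loop is contractive near the solution, so the predicted output settles on the target; a different forward model may need a different gain.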

The present work is intended first to equip the model with
a circuit that represents the inferior olive, and then to
apply various learning mechanisms that differ from each
other in which parts of the cerebellar cortex are included
in the calculation of the error signal at the inferior olive.
A single-joint robot limb actuated by two artificial
McKibben pneumatic muscles [7] was chosen to mimic
human forearm movements. The expanded model was trained
until the desired movement was accurately performed, both in
simulation and with the peripheral part replaced by the
real robot limb in order to achieve real-time learning [8,9].


II.  Model  of  motor  control 

A.  General  circuit 

The general circuit is shown in Fig. 1A. Blocks denoted $g_1$
and $g_2$ represent the bio-mechanical functions of two antagonist
muscles actuating a limb segment whose dynamics are
represented by $G$. The elements enclosed in a rectangle
drawn in dashed lines are interpreted as representing the
cerebellar cortex, which would be able to compute the
estimates $g^*$, $G^*$ of the kinematic variables $g$ and $G$ before the
movement is launched but after the motor transmission delays
$\Delta_1$ and $\Delta_2$. The summing elements from which $Q_1$ and
$Q_2$ issue are interpreted as representing the cerebellar nuclei. The
regulating pathways via the Inferior Olive are drawn here in
dashed lines to recall that no learning takes place in the initial
cybernetic circuit. The input signal being the premotor
velocity profile $\dot{\theta}_d$ of the desired movement, this circuit is able
to compute the actual position $\theta$ [4].


Fig. 1. Cybernetic circuits proposed for motor control of two antagonist
muscles. Empty arrows represent positive (excitatory) connections and black
dots negative (inhibitory) connections. (A) The initial cybernetic circuit, (B)
the circuit equipped with the ANN architecture and the Inferior Olive.

B.  Embedding the ANN in the control circuit

Fig. 1B represents a modified version of the initial circuit,
which gains the ability to learn through the embedded ANN. For
the sake of biological plausibility, the architecture of the
ANN representing the cerebellar cortex was chosen to be
similar to the anatomical connectivity [5], as shown in Fig. 2.


Fig. 2. The ANN's architecture representing the cerebellar cortex (figure
panels labeled "Treatment Stage" and "Choice Stage"). Adaptive
weights are denoted $\omega$ and symbolized by black triangles, while the
transmission delays are denoted $\delta$. Stellate and Basket cells are not
represented, and the ratios of the various cell types are not respected.

This ANN, whose functioning is detailed in [8,9], is
expected to learn how to estimate an internal representation of
the peripheral part's dynamics (i.e. the experimental
apparatus), under supervision of a teaching signal.

C.  Learning driven by a teaching signal issued from the inferior olive

According to the proposed anatomical interpretation, this
signal would represent climbing fiber activity issued from the
inferior olive, which is modeled in this work as shown in
Fig. 3 so as to detect over- or under-shoots of movements and
correct ongoing movements. The teaching signal sent to the
neural network (Fig. 2) is a square pulse of fixed unit
amplitude and of 5 ms duration, which thus does not encode the
amplitude of the error.


Fig. 3.  Circuit  representing  the  inferior  olive. 

To keep information about synapse activity during the
movement (credit assignment problem), a synaptic eligibility
is calculated by a first-order low-pass filter as in (1).

$$\tau_i \frac{de_i(t)}{dt} + e_i(t) = gr_i(t) \qquad (1)$$
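
The first-order low-pass filter of (1) can be sketched in discrete time as follows. This is a minimal illustration assuming forward-Euler integration; the time constant (50 ms), the time step (1 ms) and the test input are hypothetical values chosen for the example, not values reported in this article.

```python
def eligibility_trace(gr, tau, dt):
    """Forward-Euler integration of tau * de/dt + e = gr.

    gr  : sequence of input activity samples (assumed granular-cell output)
    tau : filter time constant in seconds (hypothetical value in the demo)
    dt  : integration step in seconds
    """
    e = 0.0
    trace = []
    for g in gr:
        e += (dt / tau) * (g - e)  # relax toward the current input
        trace.append(e)
    return trace


# A 20 ms burst of unit activity followed by 80 ms of silence.
gr = [1.0] * 20 + [0.0] * 80
trace = eligibility_trace(gr, tau=0.05, dt=0.001)
print(max(trace), trace[-1])
```

The trace rises during the burst and then decays slowly, so a teaching signal arriving after the burst can still identify which synapses were recently active.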

Then the learning rule becomes:

$$\Delta\omega_i(t) = -\eta \cdot FG(t) \cdot e_i(t) \qquad (2)$$

where $\eta$ is a small, positive learning rate and $FG(t)$ is the
teaching signal carried by the climbing fibers and calculated
according to one of the four following conditions:

• Condition A: The error calculation takes into account
only the difference between the desired and achieved
final positions.

• Condition B: The error calculation takes into account
the differences in both position and velocity. To
represent the intermittent control by the inferior olive,
the teaching signal was kept silent during a latency delay
of 400 ms after each launch of the arm.

• Condition C: The error calculation is similar to that
of condition B; in addition, the short-term reduction
in Purkinje cell activity following the firing of the
climbing fiber is taken into account: the amplitude
of the ANN's output signal is multiplied by 0.9 during
the first 30 ms after the occurrence of the teaching signal.

• Condition D: The error calculation is similar to that
of condition C; in addition, the values of the signals
Q issued from the summing elements interpreted as the
cerebellar nuclei are reduced. This was aimed at
coarsely modeling the minimization of energy expenditure,
since the signals Q partly encode the forces to be exerted by
the muscles. A teaching signal was applied whenever the
values of Q exceeded a threshold set empirically,
after several trials, to a value of 10 (arbitrary units).
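
The learning rule (2), together with the timing constraints of conditions B and C, can be sketched as below. This is an illustrative reading under stated assumptions: the learning rate, the error threshold and the way position and velocity errors are combined are hypothetical, since the article does not specify them here. Only the 400 ms latency, the 0.9 gain and the 30 ms window come from the text.

```python
ETA = 0.01                 # learning rate (hypothetical value)
LATENCY = 0.4              # s, silent period after launch (condition B)
DEPRESSION_GAIN = 0.9      # ANN output gain after a pulse (condition C)
DEPRESSION_WINDOW = 0.03   # s, duration of the reduced gain (condition C)


def teaching_signal(pos_err, vel_err, t, err_threshold=0.05):
    """Return FG(t) as a unit pulse (1.0) or silence (0.0).

    The pulse has fixed amplitude, so it does not encode error size.
    Summing absolute errors against a threshold is an assumption.
    """
    if t < LATENCY:
        return 0.0
    return 1.0 if abs(pos_err) + abs(vel_err) > err_threshold else 0.0


def update_weights(weights, eligibilities, fg, eta=ETA):
    """Rule (2): each weight changes by -eta * FG * eligibility."""
    return [w - eta * fg * e for w, e in zip(weights, eligibilities)]


def purkinje_gain(t, last_pulse_time):
    """Gain applied to the ANN output, reduced just after a pulse."""
    if last_pulse_time is not None and 0.0 <= t - last_pulse_time < DEPRESSION_WINDOW:
        return DEPRESSION_GAIN
    return 1.0


weights = update_weights([1.0, 1.0], [0.5, 0.0],
                         teaching_signal(0.2, 0.0, t=0.5), eta=0.1)
print(weights)
```

Only synapses with non-zero eligibility are modified, which is how the eligibility trace addresses the credit assignment problem in this sketch.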

D.  Actuator  of  two  antagonistic  McKibben  muscles 

An artificial McKibben pneumatic muscle, consisting of a
braided shell surrounding a rubber inner tube, is defined by
its length ($l_0$), mean radius ($r_0$) and braid angle ($\alpha_0$) when it is
not under pressure. When compressed air is blown in, the
muscle contracts to generate an axial contraction force which
can be described in terms of the contraction rate ($\varepsilon$) [7]. The
initial circuit was finally modified so as to drive the
experimental apparatus shown in Fig. 4, which consists of a
mechanical segment actuated by two artificial McKibben
muscles whose dynamic behavior can be represented by (3) and
the values reported in Table 1.


Fig. 4. Experimental apparatus with two McKibben muscles.

$$J\ddot{\theta} = 9.8\,R\,[F_1 - F_2] \qquad (3)$$


where:

$J$ = Moment of inertia of the rod
$R$ = Radius of the pulley
$\theta$ = Angular position counted from the rest position
$P_0$ = Initial air pressure
$\Delta P_i$ = Air pressure difference
$C$ = Coefficient of viscosity


Table 1: Simulation values

Parameter              Value          Parameter              Value
$l_0$                  0.3 m          $P_0$                  2.5 bar
$R$                    0.015 m        $\varepsilon_0$        0.1
$J$                    0.05 kg·m²     $\varepsilon_{max}$    0.25
$C$                    500 N·s        $f_{max}$              20 kgf/bar
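
As a rough illustration of how the contraction force of a single McKibben muscle depends on the contraction rate, the sketch below uses a static model that is common in the McKibben literature, F = π r0² P [a (1 − ε)² − b] with a = 3 / tan²(α0) and b = 1 / sin²(α0). Whether this is the exact expression behind (3) is an assumption; the radius and braid angle used here are hypothetical, and only the pressure of 2.5 bar is taken from Table 1.

```python
import math


def mckibben_force(eps, pressure_pa, r0, alpha0):
    """Static contraction force (N) for contraction rate eps.

    pressure_pa : gauge pressure in pascals
    r0          : mean radius at rest in metres (hypothetical in the demo)
    alpha0      : braid angle at rest in radians (hypothetical in the demo)
    """
    a = 3.0 / math.tan(alpha0) ** 2
    b = 1.0 / math.sin(alpha0) ** 2
    return math.pi * r0 ** 2 * pressure_pa * (a * (1.0 - eps) ** 2 - b)


def max_contraction(alpha0):
    """Contraction rate at which the modeled force drops to zero."""
    return 1.0 - 1.0 / (math.sqrt(3.0) * math.cos(alpha0))


P0 = 2.5e5                    # 2.5 bar expressed in pascals (Table 1)
R0 = 0.007                    # m, hypothetical
ALPHA0 = math.radians(20.0)   # hypothetical braid angle

forces = [mckibben_force(e / 100.0, P0, R0, ALPHA0) for e in range(0, 30, 5)]
print(forces)
```

In this model the force is largest at rest length and falls monotonically as the muscle shortens, which is the spring-like behavior that makes an antagonist pair suitable for positioning a joint.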

III.  Results


The training of the model was performed in two ways.
First, the forearm movements were simulated by replacing the
peripheral system with the model of the robot limb segment
shown in Fig. 4. Then, the experimental apparatus itself was used
to perform the same movements. The results obtained
with the four different learning schemes, A to D, in
these two ways are shown in Fig. 5 and Fig. 6 respectively.


Fig. 5. Model's learning capacity on simulated robot limb movements.
Top panel: Error evolution (Type = A, B, C, D). Bottom panel:
Performance on positioning (Type = D).

Fig. 6. Model's learning capacity on real robot limb movements.
Top panel: Error evolution (Type = A, B, C, D). Bottom panel:
Performance on positioning (Type = D).


The curves at the top of Figs. 5 and 6 show the evolution of
the mean squared error as learning proceeds, while the curves
at the bottom show the results of the set of 10 movements
performed after learning of type D. Solid and dashed lines
represent desired and actual movements respectively. The
ANN was composed of twenty granular cells with time
constants randomly set in the range 1-6 ms, and one Golgi cell and
one Purkinje cell with time constants of 10 ms. Transmission
delays $\delta$ between cells were randomly set in the range
1-10 ms. The motor delays $\Delta_1$ and $\Delta_2$ were set to 50 ms. The
learning phase proceeded in 300 iterations on a predefined set
of 5 positive (counterclockwise) and 5 negative (clockwise)
horizontal movements, presented alternately, one at a
time. Each movement lasted 5 seconds and all velocity
profiles were centered on 2.5 s. The amplitude range was
±40 degrees.
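
A training set of this kind can be sketched as follows. The 5 s duration, the 2.5 s center, the 5 positive and 5 negative movements and the ±40 degree range come from the text; the Gaussian shape of the velocity profiles, their width and the individual amplitudes are assumptions made for the example.

```python
import math


def velocity_profile(amplitude_deg, duration=5.0, center=2.5,
                     sigma=0.4, dt=0.001):
    """Bell-shaped velocity profile (deg/s) whose integral is amplitude_deg.

    The Gaussian shape and sigma are hypothetical choices.
    """
    n = int(round(duration / dt)) + 1
    raw = [math.exp(-0.5 * ((i * dt - center) / sigma) ** 2) for i in range(n)]
    area = sum(raw) * dt  # rectangle-rule integral of the raw bell
    return [amplitude_deg * r / area for r in raw]


def integrate(velocity, dt=0.001):
    """Rectangle-rule integration of velocity into position (deg)."""
    pos, out = 0.0, []
    for v in velocity:
        pos += v * dt
        out.append(pos)
    return out


amplitudes = [8, 16, 24, 32, 40]  # degrees, hypothetical spacing
training_set = []
for a in amplitudes:
    training_set.append(velocity_profile(+a))  # counterclockwise
    training_set.append(velocity_profile(-a))  # clockwise

print(len(training_set), integrate(training_set[-2])[-1])
```

Alternating the sign of successive movements mirrors the alternating presentation described above, and integrating any profile recovers the desired final position.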

The way the teaching signal is calculated considerably
affects the number of iterations needed to achieve
the desired movement, and the best result is obtained with
a learning scheme of type D in both testing cases. Some
internal signals measured on the two sides of the circuit,
numbered 1 and 2 respectively, during two movements of the same
amplitude but with different velocities are illustrated in
Figs. 7 and 8. In each figure, from left to right and from top
to bottom, the curves in the rectangles show the desired (dashed
lines) and achieved (solid lines) angular displacements, the
time-courses of the signals issued from the Purkinje cells, the motor
orders and finally the force profiles measured by force sensors.

Fig. 7. Internal signals during a slow movement (panel titles include
"Angular Displacement" and "Purkinje cells' outputs"). Except for the top right
corner, the curves plotted in dotted and solid lines represent the signals
measured on side 1 and 2 of the circuit shown in Fig. 1B respectively.

IV.  Conclusions

This model of the cerebellar pathways, although quite
simplified, suffices to control movements of a single segment
actuated by means of artificial muscles endowed with
non-linear characteristics. Its connectivity accounts essentially for
the divergence of information carried by the mossy fibers to
the granule cells and then the convergence of the latter's
outputs, conveyed by parallel fibers, within the dendritic
arborizations of the Purkinje cells. After learning, the output signal
of the ANN (i.e. of the Purkinje cells, whose axons convey the
output signal of the cerebellar cortex to the cerebellar nuclei)
anticipates the velocity of the movement achieved by the
mobile segment. As a consequence, the motor orders which
result from the addition of the various premotor orders in the
motoneurons can be considered as encoding a "virtual
trajectory", which is not calculated on purpose since it results
from the functioning of feedback loops consistent with
anatomy and assumed to be internal to the CNS.

Altogether, the looped structure and the anticipative ability
allow the model both to invert the bio-mechanical functions
and to integrate the premotor velocity signals. Notably,
movements of the same amplitude can be driven at different
velocities, and the time-courses of the forces exerted by the
pneumatic muscles resemble those of the electromyograms of
real muscles during arm movements at various velocities.
Thus, merging an ANN designed to account for cell
connectivity into a cybernetic model grounded on functional
principles enables both the control of a simple robot and the
reproduction of physiological observations.

Fig. 8. Internal signals during a fast movement. Except for the top right
corner, the curves plotted in dotted and solid lines represent the signals
measured on side 1 and 2 of the circuit shown in Fig. 1B respectively.


